Principal Component Analysis on Image Sets
Principal component analysis (PCA) is a type of multivariate statistical analysis used to find associations among a large number of datasets. It can make large datasets easier to understand and can help visualize commonalities and differences across a large collection of data. Association between datasets is measured by their covariance. As in linear regression, the covariance of two datasets is the average product of their corresponding data elements minus the product of their means. A covariance of zero means that no linear statistical correlation exists between the datasets. The data from n datasets are mapped into n-dimensional space, and the angles and distances separating the point clouds in this multidimensional space can be determined, which provides useful information on how closely related the datasets are to each other. PCA is usually performed on data unrelated to images; however, software is available that allows multispectral images to be analysed using this technique, and n principal component images can be produced from a set of n multispectral images. Each principal component image shows the image data projected onto one of the n principal axes found in that n-dimensional space.
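The covariance definition above can be sketched in a few lines of Python. This is only an illustration: it assumes NumPy is available, the function name band_covariance is hypothetical, and the tiny 2x2 "images" are made-up values rather than Clementine data.

```python
import numpy as np

def band_covariance(a, b):
    """Covariance of two equally sized images: mean(a*b) - mean(a)*mean(b)."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    return float(np.mean(a * b) - np.mean(a) * np.mean(b))

# Hypothetical 2x2 "images" (illustrative values only).
band_a = np.array([[1.0, 2.0], [3.0, 4.0]])
band_b = np.array([[2.0, 4.0], [6.0, 8.0]])    # exactly twice band_a
band_c = np.array([[1.0, -1.0], [-1.0, 1.0]])  # pattern unrelated to band_a

cov_ab = band_covariance(band_a, band_b)  # positive: the bands rise together
cov_ac = band_covariance(band_a, band_c)  # zero: no linear relationship
print(cov_ab, cov_ac)
```

The result for band_a and band_b agrees with NumPy's own np.cov when it is called with bias=True, which divides by the number of pixels rather than by one less than that number.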
I will attempt to describe the technique using Clementine calibrated images of Tycho as an example. Clementine UVVIS data includes the following five filter images. The last image shown is a composite ratio image called a maturation ratio image (see Multispectral Imaging submenu).

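[Images: Clementine UVVIS filter images of Tycho at 415 nm, 750 nm, 900 nm, 950 nm and 1000 nm, followed by the maturation ratio image with R=750/415, G=750/1000, B=415/750]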
To perform a principal component analysis of the five-filter data, each filter image is assigned to a Cartesian axis in five-dimensional space. A best-fit plane is calculated through the data covariance point cloud in this five-dimensional space, and additional planes are added orthogonally until five planes have been created. The smaller the angular distance between a data point and a particular plane, the more relevant the point is to that plane. A statistical analysis of the five-axis principal components for the Tycho images above is shown below:

Each of the calculated principal component images is shown below:

[Images: 1st, 2nd, 3rd, 4th and 5th principal component images]
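For readers who want to experiment without the software described here, the calculation of principal component images can be sketched with NumPy. This is a minimal sketch under stated assumptions, not the method used by any particular program: it uses the standard eigenvector decomposition of the band covariance matrix, the function name pca_images is hypothetical, and the five "bands" are synthetic random data standing in for co-registered filter images.

```python
import numpy as np

def pca_images(bands):
    """Return (component_images, eigenvalues, eigenvectors) for a band stack.

    bands has shape (n_bands, height, width); all bands must be co-registered.
    """
    bands = np.asarray(bands, dtype=float)
    n_bands, height, width = bands.shape
    pixels = bands.reshape(n_bands, -1)                # one row per band
    centered = pixels - pixels.mean(axis=1, keepdims=True)
    cov = centered @ centered.T / centered.shape[1]    # n_bands x n_bands covariance
    eigenvalues, eigenvectors = np.linalg.eigh(cov)    # returned in ascending order
    order = np.argsort(eigenvalues)[::-1]              # largest variance first
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    scores = eigenvectors.T @ centered                 # project pixels onto each axis
    return scores.reshape(n_bands, height, width), eigenvalues, eigenvectors

# Synthetic stand-in for five filter images (assumption: not real Clementine data).
rng = np.random.default_rng(0)
surface = rng.normal(size=(32, 32))
bands = np.stack([w * surface + 0.05 * rng.normal(size=(32, 32))
                  for w in (1.0, 0.9, 0.8, 0.7, 0.6)])

components, eigenvalues, eigenvectors = pca_images(bands)
explained = eigenvalues / eigenvalues.sum()  # fraction of total variance per component
print(explained)
```

The eigenvalues play the role of the component statistics: each one is the variance carried by its principal component image. The sign of each component is arbitrary, so an image produced this way may appear inverted relative to the output of other software.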
The relevance of the images decreases from the 1st through the 5th component. The last principal component consists mainly of noise, a fact that can be used to process the noise out of images and is a common use for this type of image analysis. Even so, different aspects of the geology and/or mineral composition of the lunar surface can be revealed by the principal component images.
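The noise removal idea can be sketched as follows: transform to principal components, discard the weakest one, and transform back. This is an illustrative sketch only. The function name pca_denoise and its keep parameter are hypothetical, NumPy is assumed, and the test data are synthetic, with a known "clean" signal so that the error can be measured.

```python
import numpy as np

def pca_denoise(bands, keep):
    """Rebuild a band stack from only its `keep` strongest principal components."""
    bands = np.asarray(bands, dtype=float)
    n_bands, height, width = bands.shape
    pixels = bands.reshape(n_bands, -1)
    mean = pixels.mean(axis=1, keepdims=True)
    centered = pixels - mean
    cov = centered @ centered.T / centered.shape[1]
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1]                   # strongest component first
    kept_axes = eigenvectors[:, order[:keep]]               # drop the weakest axes
    restored = kept_axes @ (kept_axes.T @ centered) + mean  # inverse transform
    return restored.reshape(n_bands, height, width)

# Synthetic five-band data: one shared surface pattern plus independent noise.
rng = np.random.default_rng(1)
surface = rng.normal(size=(32, 32))
weights = np.array([1.0, 0.9, 0.8, 0.7, 0.6])
clean = weights[:, None, None] * surface
noisy = clean + 0.05 * rng.normal(size=clean.shape)

denoised = pca_denoise(noisy, keep=4)  # discard only the 5th component
error_before = np.mean((noisy - clean) ** 2)
error_after = np.mean((denoised - clean) ** 2)
print(error_before, error_after)
```

How many components to keep is a judgment call. Dropping only the last one, as here, gives a modest improvement, while dropping more risks removing real compositional signal along with the noise.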
In addition, principal component images can be assigned to the red, green, and blue channels of a color image to reveal unsuspected compositional information. The image below is a false color composite created by assigning the 1st principal component to the red channel, the 2nd to the green channel, and the 3rd to the blue channel.

This false color image provides interesting information about the mafic composition of the central peak and rim of Tycho. The principal component images and statistics were produced by FoveaPro 4.0 software, which works as a set of plug-in filters in Adobe Photoshop CS2 (see links submenu). In Photoshop CS2, the 415 nm, 750 nm, 900 nm, 950 nm, and 1000 nm Clementine images were loaded into alpha channels 1 through 5 respectively, and the R, G and B channels were deleted. All alpha channels were selected. FoveaPro was then used to create a PCA transform, and a forward PCA analysis was performed, which caused the Clementine images to be replaced by the PCA images pertinent to each alpha channel. The first three alpha channels, representing the first, second and third principal component images, were then moved to the R, G and B channels respectively of a new image.
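The same channel assignment can be sketched outside Photoshop. The example below is an assumption-laden illustration, not a reproduction of what FoveaPro does internally: the function names stretch_to_uint8 and pca_false_color are hypothetical, a simple linear stretch to 0-255 is assumed for display, and the input is random data standing in for the five filter images.

```python
import numpy as np

def stretch_to_uint8(image):
    """Linearly rescale an image to the 0-255 range for display."""
    image = np.asarray(image, dtype=float)
    low, high = image.min(), image.max()
    if high == low:                       # flat image: avoid dividing by zero
        return np.zeros(image.shape, dtype=np.uint8)
    return np.round(255.0 * (image - low) / (high - low)).astype(np.uint8)

def pca_false_color(bands):
    """Map the 1st, 2nd and 3rd principal components to R, G and B."""
    bands = np.asarray(bands, dtype=float)
    n_bands, height, width = bands.shape
    pixels = bands.reshape(n_bands, -1)
    centered = pixels - pixels.mean(axis=1, keepdims=True)
    cov = centered @ centered.T / centered.shape[1]
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1][:3]           # three strongest components
    scores = (eigenvectors[:, order].T @ centered).reshape(3, height, width)
    channels = [stretch_to_uint8(channel) for channel in scores]
    return np.stack(channels, axis=-1)                  # shape (height, width, 3)

# Random stand-in for five co-registered filter images (not real data).
rng = np.random.default_rng(2)
bands = rng.normal(size=(5, 32, 32))
rgb = pca_false_color(bands)
print(rgb.shape, rgb.dtype)
```

The resulting array can be saved with any image library that accepts a height x width x 3 array of 8-bit values. Because each channel is stretched independently, the colors show relative differences within each component rather than absolute values.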
Alternatively, principal component images can be created using the very comprehensive TNT mips lite program, made available without charge by the Microimages Corporation. The download site for this software is http://www.microimages.com/tntlite/. This program has some restrictions on file size but is otherwise essentially as functional as the full version of TNT mips. It is an excellent program for analysis of images, including multispectral and hyperspectral images, and it has an excellent statistical analysis package for application to images.
